PERCEPTUAL TIME-VARYING MODELLING OF SPEECH SIGNALS FOR ASR COMPRESSION APPLICATION (MonAmOR3)

Authors

Amir Leibman

Ilan D. Shallom

Abstract

Perceptual audio coders and Automatic Speech Recognition (ASR) systems are commonly based on short-time analysis. This paper presents a generalized model for time-varying coefficients based on psychoacoustic properties of the human ear. The proposed model is evaluated in the framework of speaker-independent speech recognition using Hidden Markov Models (HMMs). The generalized model is compared to MFCC, the traditional and most popular feature set. The comparison is made with respect to each model's baud rate and the total error rate measured in an extensive speech recognition experiment. Recognition is based on the well-established speech recognition development environment HTK, with TIDIGIT as the evaluation database. The time-varying model achieves a better recognition rate than MFCC, while its baud rate is about one third of that used for MFCC. In addition, a preliminary evaluation of the model's robustness to noise was carried out and is presented.

Similar resources

Noise Robustness of Traditional Features for Macedonian Voice Dialing ASR

Automatic Speech Recognition Systems of today are intensely deployed in real world application scenarios which are often characterized by suboptimal operating conditions. Thus their noise robustness has become a crucial parameter when assessing ASR in-field performance. The paper examines the noise robustness of traditional ASR feature sets as applied to a Voice Dialing Application built for Ma...

A Novel Algorithm of Sparse Representations for Speech Compression/Enhancement and Its Application in Speaker Recognition System

This paper proposes sparse and redundancy representation spectral domain compression of the speech signal using novel sparsing algorithms to the problem of speech compression (SC)/enhancement (SE). In Automatic Speaker Recognition (ASR) sparsification can play a major role to resolve big data issues in speech compression and its storage in the database, where the speech signal can be uncompress...

Progresses in continuous speech recognition based on statistical modelling for romanian language

In this paper we will present progresses made in Automatic Speech Recognition (ASR) for Romanian language based on statistical modelling with hidden Markov models (HMMs). The progresses concern enhancement of modelling by taking into account the context in form of triphones, improvement of speaker independence by applying a gender specific training and enlargement of the feature categories used...

Continuous-time models for AM-FM signal demodulation and their application to speech recognition

Automatic speech recognition (ASR) systems can benefit from including into their acoustic processing part new features that account for various nonlinear and time-varying phenomena during speech production. In this paper, we develop robust continuous-time expansions used to demodulate the instantaneous amplitudes and frequencies of the speech resonances and extract novel acoustic features from s...

Plasticity in Systems for Automatic Speech Recognition: A Review

Although the topic ‘plasticity in speech perception’ is primarily concerned with the malleability of human speech perceptual behaviour, it may be illuminating to consider in parallel the degree to which current state-of-the-art ‘automatic speech recognition’ (ASR) systems also change their behaviour over time. This paper provides a review of the computational mechanisms underlying contemporary ...

Publication date: 2005
